1. Introduction



Since Stein introduced his method for normal approximation in 1972, much
has been developed for normal approximation in one dimension for dependent
random variables, for both smooth and non-smooth functions. A typical
non-smooth function is the indicator of a half line. Three approaches have
been developed to deal with non-smooth functions: the induction approach
popularized by Bolthausen (1984), the recursive approach of Raič (2003) and
the concentration inequality approach developed by Chen (1986), Chen (1998),
Chen and Shao (2001) and Chen and Shao (2004).

Although Stein's method has been extended to multivariate normal
approximation (see, for example, Barbour (1990), Götze (1991), Goldstein and
Rinott (1996), Chatterjee and Meckes (2008), Reinert and Röllin (2009)),
relatively few results have been obtained for non-smooth functions, typically
indicators of convex sets in finite-dimensional Euclidean spaces. In general,
it is much harder to obtain optimal bounds for non-smooth functions than for
smooth functions. As far as we know, the only results for non-smooth functions
are those of Götze (1991), Rinott and Rotar (1996) and Bhattacharya and Holmes
(2010), the last being an exposition of Götze's result. While the result of
Rinott and Rotar (1996) is for bounded locally dependent random vectors, those
of Götze (1991) and of Bhattacharya and Holmes (2010) are for independent
random vectors with finite third moments. The approach of Götze (1991) and of
Bhattacharya and Holmes (2010) is by induction.

In this paper, we extend the concentration inequality approach to the
multivariate setting. We prove that for $W = \sum_{i=1}^n X_i$, a sum of
independent random vectors standardized so that $EW = 0$ and
$EWW^T = I_{k\times k}$,

\[
P(W^{(i)} \in A^{4\gamma+\epsilon} \setminus A^{4\gamma}) \le 4.1 k^{1/2}\epsilon + 39 k^{1/2}\gamma \tag{1.1}
\]

and, with $|\cdot|$ denoting the Euclidean norm of a vector,

\[
P(W \in A^{4\gamma+|X_i|} \setminus A^{4\gamma}) \le 4.1 k^{1/2} E|X_i| + 39 k^{1/2}\gamma \tag{1.2}
\]

where $A$ is a convex set in $\mathbb{R}^k$,
$A^\epsilon = \{x \in \mathbb{R}^k : d(x,A) \le \epsilon\}$ for $\epsilon > 0$,
$W^{(i)} = W - X_i$ and $\gamma = \sum_{i=1}^n E|X_i|^3$. Using these
concentration inequalities, we prove a normal approximation theorem for $W$
with an error bound of the order $k^{1/2}\gamma$. This dependence of $k^{1/2}$
on the dimension is better than the $k^{5/2}$ and $k^{3/2}$ obtained by
Bhattacharya and Holmes (2010) and the $k$ stated in Götze (1991). Our
concentration inequality approach provides a new way of dealing with dependent
random vectors, for example, those under local dependence, for which the
induction approach is not likely to be applicable.

Comparing our result with those assuming finite third moments and using
other methods in the literature, only the result of Bentkus (2003) gives a
bound depending on $k^{1/4}$, which is better than $k^{1/2}$. But his result is
for i.i.d. random vectors. Other results for i.i.d. random vectors, for
example, by Nagaev (1976), Senatov (1980) and Sazonov (1981), depend on $k$.

This paper is organized as follows. In Section 2, we develop techniques for
the concentration inequality approach in the multivariate setting. In Section 3,
we use the concentration inequality approach to prove a multivariate normal
approximation theorem for sums of independent random vectors. In Section 4,
we prove the technical lemmas stated in Section 2.

Throughout the paper, let $|\cdot|$ denote the Euclidean norm of vectors, and
let $\|\cdot\|$ denote the operator norm of matrices. Let $\partial_j f$ denote
the first partial derivative of $f$ along the coordinate $j$. For a positive
integer $k$, $[k] = \{1, 2, \dots, k\}$. Finally, let $I_{k\times k}$ denote the
$k$ by $k$ identity matrix.

2. Concentration inequalities 

As a powerful tool for proving distributional approximations along with
error bounds, Stein's method has been extensively developed in the literature
for random variables with all kinds of dependence structure. While it works
well for smooth function distances, it requires much more effort to obtain
optimal bounds for non-smooth function distances such as the Kolmogorov
distance. To overcome this difficulty, we consider the probability that a
random variable $W$ takes values in a small interval $[a, b]$. A bound on
$P(W \in [a, b])$ is called a concentration inequality. Now if $W$ is a
$k$-dimensional random vector and $Z$ is a $k$-dimensional standard Gaussian
random vector, the non-smooth function distance between $\mathcal{L}(W)$ and
$\mathcal{L}(Z)$ usually means
$\sup_{A \in \mathcal{A}} |P(W \in A) - P(Z \in A)|$, where $\mathcal{A}$
denotes the set of all convex sets in $\mathbb{R}^k$. A concentration
inequality in this setting would be a bound on
$P(W \in A^\epsilon \setminus A)$, where
$A^\epsilon = \{x \in \mathbb{R}^k : d(x, A) \le \epsilon\}$ and
$d(x, A) = \inf_{y \in A} |x - y|$.

For a given convex set $A \subset \mathbb{R}^k$ and $\epsilon > 0$, we define
$f = f(A, \epsilon) = (f_1, f_2, \dots, f_k)^T : \mathbb{R}^k \to \mathbb{R}^k$
as follows. For $x \in \bar{A}$, where $\bar{A}$ is the closure of $A$,
$f(x) = 0$. For $x \in A^\epsilon \setminus \bar{A}$, find $x_0$, the nearest
point in $\bar{A}$ from $x$, and define $f(x) = x - x_0$. For
$x \in \mathbb{R}^k \setminus A^\epsilon$, find $x_0$, the nearest point in
$\bar{A}$ from $x$, and $x_1$, the intersection of
$\{x_0 + t(x - x_0) : t \in [0, 1]\}$ and $\partial A^\epsilon$, the boundary of
$A^\epsilon$, and define $f(x) = x_1 - x_0 = f(x_1)$. We have the following
four lemmas regarding the properties of the function $f$ defined above.
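The definition above can be made concrete for a simple convex set. The following is a minimal sketch, assuming $A$ is the box $[0,1]^k$ (chosen here only because its nearest-point map is coordinate-wise clipping); the names `nearest_in_box`, `f_box` and `check_lemmas` are hypothetical helpers, not part of the paper. It evaluates $f(A,\epsilon)$ and checks numerically, on random points, the norm bound $|f| \le \epsilon$ and the monotonicity property $\xi \cdot (f(\eta+\xi) - f(\eta)) \ge 0$.

```python
import math
import random


def nearest_in_box(x, lo=0.0, hi=1.0):
    """Nearest point x0 of the closed box [lo, hi]^k to x (coordinate-wise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]


def f_box(x, eps, lo=0.0, hi=1.0):
    """f(A, eps) for the box A = [lo, hi]^k (an assumed example set)."""
    x0 = nearest_in_box(x, lo, hi)
    diff = [xi - x0i for xi, x0i in zip(x, x0)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist <= eps:
        # x in the closure of A (diff = 0) or in A^eps \ A: f(x) = x - x0
        return diff
    # x outside A^eps: f(x) = x1 - x0, where x1 lies on the segment [x0, x]
    # at distance eps from x0
    scale = eps / dist
    return [scale * d for d in diff]


def check_lemmas(k=3, eps=0.3, trials=2000, seed=0):
    """Return (max |f(eta)|, min xi.(f(eta+xi) - f(eta))) over random eta, xi."""
    rng = random.Random(seed)
    worst_norm, worst_inner = 0.0, float("inf")
    for _ in range(trials):
        eta = [rng.uniform(-2.0, 3.0) for _ in range(k)]
        xi = [rng.uniform(-2.0, 2.0) for _ in range(k)]
        f_eta = f_box(eta, eps)
        f_shift = f_box([a + b for a, b in zip(eta, xi)], eps)
        worst_norm = max(worst_norm, math.sqrt(sum(v * v for v in f_eta)))
        inner = sum(b * (u - v) for b, u, v in zip(xi, f_shift, f_eta))
        worst_inner = min(worst_inner, inner)
    return worst_norm, worst_inner
```

Calling `check_lemmas()` returns the largest observed norm of $f$ and the smallest observed inner product; the first should not exceed $\epsilon$ and the second should not be negative, up to floating-point rounding.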

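Lemma 2.1. We have

\[
|f| \le \epsilon. \tag{2.1}
\]

Lemma 2.2. For all $\xi, \eta \in \mathbb{R}^k$,

\[
\xi \cdot (f(\eta + \xi) - f(\eta)) \ge 0. \tag{2.2}
\]

Lemma 2.3. For every $i \in [k]$ and any fixed
$x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_k$, $f_i$ is absolutely continuous in
$x_i$ and

\[
\partial_i f_i(x) \ge 0 \quad \text{a.e.} \tag{2.3}
\]

For $x \in (A^\epsilon)^\circ \setminus \bar{A}$, where $A^\circ$ denotes the
interior of $A$, we have a sharper lower bound for $\partial_i f_i(x)$. Let
$\theta = (\theta_1, \theta_2, \dots, \theta_k)^T$ be the angles between
$x - x_0$ and the axes.

Lemma 2.4. For all $i \in [k]$ and $x \in (A^\epsilon)^\circ \setminus \bar{A}$,

\[
\partial_i f_i(x) \ge \cos^2\theta_i \quad \text{a.e.} \tag{2.4}
\]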
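As an illustration of the lower bound $\partial_i f_i(x) \ge \cos^2\theta_i$, here is a small numerical sketch, assuming $A$ is the closed unit ball centered at the origin (an example set chosen for this illustration, so that $x - x_0$ is parallel to $x$ and $\cos^2\theta_i = x_i^2/|x|^2$). The helper names are hypothetical. Partial derivatives are approximated by central differences.

```python
import math


def f_ball(x, eps, r=1.0):
    """f(A, eps) for A = closed ball of radius r centered at 0 (assumed example)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= r:
        return [0.0] * len(x)
    length = min(norm - r, eps)  # |x - x0| inside A^eps, capped at eps outside
    return [length * v / norm for v in x]


def diag_derivative(x, i, eps, h=1e-6):
    """Central-difference approximation of the partial derivative d_i f_i at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f_ball(xp, eps)[i] - f_ball(xm, eps)[i]) / (2.0 * h)


def cos2_angles(x):
    """cos^2 of the angles between x - x0 and the axes; for a centered ball,
    x - x0 is parallel to x."""
    n2 = sum(v * v for v in x)
    return [v * v / n2 for v in x]
```

For a point such as `x = [0.9, 0.6, 0.2]` with `eps = 0.3` (so that $1 < |x| < 1 + \epsilon$), each `diag_derivative(x, i, eps)` can be compared with `cos2_angles(x)[i]`; the $\cos^2\theta_i$ sum to one, which is what turns the sum of the $k$ diagonal derivatives into a probability in the proofs that follow.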

We defer the proofs of the lemmas to Section 4. To obtain a concentration
inequality for a random vector $W$ of interest, we apply the function $f$
defined above in the Stein identity for $W$. We consider the following two
cases: multivariate Gaussian vectors and sums of independent random vectors.

2.1. Multivariate normal distribution 

Proposition 2.5. Let $Z = (Z_1, Z_2, \dots, Z_k)^T$ be a $k$-dimensional
standard Gaussian random vector. Then for any convex set $A$ in $\mathbb{R}^k$
and $\epsilon_1, \epsilon_2 \ge 0$,

\[
P(Z \in A^{\epsilon_1} \setminus A^{-\epsilon_2}) \le k^{1/2}(\epsilon_1 + \epsilon_2) \tag{2.5}
\]

where $A^\epsilon = \{x \in \mathbb{R}^k : d(x, A) \le \epsilon\}$ and
$A^{-\epsilon} = \{x \in \mathbb{R}^k : B(x, \epsilon) \subset A\}$, where
$B(x, \epsilon)$ is the $k$-dimensional ball centered at $x$ with radius
$\epsilon$.

Proof. From the joint independence among $\{Z_1, Z_2, \dots, Z_k\}$ and the
integration by parts formula, we have the following $k$ functional identities
for $Z$.

\[
\begin{aligned}
E Z_1 f_1(Z) &= E \partial_1 f_1(Z), \\
&\;\;\vdots \\
E Z_k f_k(Z) &= E \partial_k f_k(Z).
\end{aligned} \tag{2.6}
\]
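In one dimension the identity $E Z f(Z) = E f'(Z)$ already shows how a concentration inequality comes out of Stein's method. The sketch below is an illustration under assumed choices (the interval endpoint `a`, the width `eps`, and a midpoint-rule integrator are all introduced here, not taken from the paper): with $f(z) = \min(\max(z - a, 0), \epsilon)$, the right-hand side equals $P(a < Z < a + \epsilon)$ and the left-hand side is at most $\epsilon E|Z| \le \epsilon$.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def f_interval(z, a, eps):
    """One-dimensional analogue of f(A, eps) for the half line A = (-inf, a]."""
    return min(max(z - a, 0.0), eps)


def stein_lhs(a, eps, lim=10.0, n=200_000):
    """E[Z f(Z)] by the midpoint rule on [-lim, lim]."""
    h = 2.0 * lim / n
    total = 0.0
    for m in range(n):
        z = -lim + (m + 0.5) * h
        total += z * f_interval(z, a, eps) * phi(z)
    return total * h


def stein_rhs(a, eps):
    """E[f'(Z)] = P(a < Z < a + eps), since f' is the indicator of (a, a + eps)."""
    return Phi(a + eps) - Phi(a)
```

For instance, `stein_lhs(0.5, 0.2)` and `stein_rhs(0.5, 0.2)` should agree up to quadrature error, and both are bounded by `0.2`.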

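Using the function $f = f(A, \epsilon)$ defined at the beginning of this
section, where $A$ is a convex set in $\mathbb{R}^k$ and $\epsilon > 0$, and
summing up the above $k$ equations, we have

\[
\sum_{j=1}^k E Z_j f_j(Z) = \sum_{j=1}^k E \partial_j f_j(Z). \tag{2.7}
\]

By Lemma 2.1 and the Cauchy-Schwarz inequality, LHS of (2.7)
$\le \epsilon E|Z| \le k^{1/2}\epsilon$. By Lemma 2.3 and Lemma 2.4,

\[
\begin{aligned}
\text{RHS of (2.7)} &\ge \sum_{j=1}^k E \partial_j f_j(Z) I(Z \in (A^\epsilon)^\circ \setminus \bar{A}) \\
&\ge E \sum_{j=1}^k \cos^2\theta_j \, I(Z \in (A^\epsilon)^\circ \setminus \bar{A}) = P(Z \in (A^\epsilon)^\circ \setminus \bar{A}).
\end{aligned} \tag{2.8}
\]

Therefore,

\[
P(Z \in A^\epsilon \setminus A) \le k^{1/2}\epsilon. \tag{2.9}
\]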
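The bound $P(Z \in A^\epsilon \setminus A) \le k^{1/2}\epsilon$ can be compared with two cases where the left-hand side has a closed form. This is an illustrative sketch with assumed example sets (a half-space in any dimension, and a centered disc in dimension $k = 2$); the function names are hypothetical.

```python
import math


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def halfspace_shell_prob(a, eps):
    """P(Z in A^eps \\ A) for the half-space A = {x : x_1 <= a}, in any dimension k."""
    return Phi(a + eps) - Phi(a)


def disc_shell_prob(r, eps):
    """P(Z in A^eps \\ A) for the disc A = {|x| <= r} in dimension k = 2,
    using P(|Z| <= t) = 1 - exp(-t^2 / 2)."""
    return math.exp(-0.5 * r * r) - math.exp(-0.5 * (r + eps) ** 2)


def gaussian_shell_bound(k, eps):
    """Right-hand side k^{1/2} * eps of the concentration bound."""
    return math.sqrt(k) * eps
```

In both examples the exact shell probability is well below $k^{1/2}\epsilon$, which is consistent with the remark later in this section that the optimal order in $k$ is $k^{1/4}$.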

The bound (2.5) can be deduced from the above inequality by the arguments in
Section 1.3 of Bhattacharya and Rao (1986), sketched as follows.

Without loss of generality, assume $A^\circ \ne \emptyset$. First suppose $A$
is bounded. Given any $\delta > 0$, we may choose
$x_1, x_2, \dots, x_n \in \partial A$ such that
$\partial A \subset \{x_1, \dots, x_n\}^\delta$. Let $P$ be the convex hull of
$\{x_1, \dots, x_n\}$. By taking $\delta$ small enough, $P^\circ \ne \emptyset$.
For some positive integer $m$, $P$ can be expressed as

\[
P = \{x \in \mathbb{R}^k : u_j \cdot x \le d_j, \ 1 \le j \le m\}
\]

where the $u_j$'s are distinct unit vectors and the $d_j$'s are real numbers.
For each real $a$, define

\[
P_a = \{x \in \mathbb{R}^k : u_j \cdot x \le d_j + a, \ 1 \le j \le m\}.
\]

Then from the fact that $P \subset A \subset P^\delta$, we have

\[
A^{\epsilon_1} \setminus A^{-\epsilon_2} \subset (P^\delta)^{\epsilon_1} \setminus P^{-\epsilon_2} \subset P_{\epsilon_1+\delta} \setminus P_{-\epsilon_2}.
\]

Therefore,

\[
P(Z \in A^{\epsilon_1} \setminus A^{-\epsilon_2}) \le P(Z \in P_{\epsilon_1+\delta} \setminus P_{-\epsilon_2}) = \int_{-\epsilon_2}^{\epsilon_1+\delta} \int_{\partial P_a} \phi \, d\lambda_{k-1} \, da \tag{2.10}
\]

where $\phi$ is the density of the standard $k$-dimensional normal
distribution and $\lambda_{k-1}$ is the Lebesgue measure in
$\mathbb{R}^{k-1}$. We used Lemma 3.9 in Bhattacharya and Rao (1986) in the
last equality. From the arguments leading to (3.35) in Bhattacharya and Rao
(1986),

\[
\Big| P(Z \in (P_a)^\epsilon \setminus P_a) - \epsilon \int_{\partial P_a} \phi \, d\lambda_{k-1} \Big| \le o(\epsilon).
\]

The above inequality and (2.9) result in

\[
\int_{\partial P_a} \phi \, d\lambda_{k-1} \le k^{1/2}.
\]

Therefore, from (2.10),

\[
P(Z \in A^{\epsilon_1} \setminus A^{-\epsilon_2}) \le k^{1/2}(\epsilon_1 + \epsilon_2 + \delta).
\]

The bound (2.5) is proved by letting $\delta \to 0$. If $A$ is unbounded,
consider $A_r = A \cap B(0, r)$ and let $r \to \infty$. □

Remark 2.6. It is known that
$P(Z \in A^{\epsilon_1} \setminus A^{-\epsilon_2}) \le 4k^{1/4}(\epsilon_1 + \epsilon_2)$,
which is of optimal order in $k$ (see Ball (1993) and Bentkus (2003)). It is
not clear how we can obtain $k^{1/4}$ in the bound by our approach.

2.2. Sum of independent random vectors 
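Proposition 2.7. Let the $k$-dimensional random vector $W$ be

\[
W = (W_1, \dots, W_k)^T = \sum_{i=1}^n X_i = \sum_{i=1}^n (X_{i1}, X_{i2}, \dots, X_{ik})^T
\]

where $\{X_i : i \in [n]\}$ are independent random vectors such that
$EX_i = 0$ and $EWW^T = I_{k\times k}$. Then, for any convex set $A$ in
$\mathbb{R}^k$,

\[
P(W^{(i)} \in A^{4\gamma+\epsilon} \setminus A^{4\gamma}) \le 4.1 k^{1/2}\epsilon + 39 k^{1/2}\gamma \tag{2.11}
\]

and

\[
P(W \in A^{4\gamma+|X_i|} \setminus A^{4\gamma}) \le 4.1 k^{1/2} E|X_i| + 39 k^{1/2}\gamma \tag{2.12}
\]

for any $\epsilon > 0$ and $i \in [n]$, where $W^{(i)} = W - X_i$ and
$\gamma = \sum_{i=1}^n \gamma_i = \sum_{i=1}^n E|X_i|^3$.

Proof. Without loss of generality, assume $\gamma$ is finite. In this proof,
let $\sum_{j' \ne j''}$ denote $\sum_{j'=1}^k \sum_{j'' \le k, \, j'' \ne j'}$
and, for a fixed $i$, let $\sum_{j \ne i}$ denote $\sum_{j \le n, \, j \ne i}$.
We use $f = f(A, \epsilon + 8\gamma)$ defined at the beginning of this section
in the following Stein identity for $W^{(i)}$.

\[
E W^{(i)} \cdot f(W^{(i)}) = \sum_{j \ne i} E X_j \cdot (f(W^{(i)}) - f(W^{(i)} - X_j)). \tag{2.13}
\]

Because $|f| \le \epsilon + 8\gamma$, LHS of (2.13)
$\le k^{1/2}(\epsilon + 8\gamma)$. From Lemma 2.2,

\[
\begin{aligned}
\text{RHS of (2.13)}
&\ge \sum_{j \ne i} E X_j \cdot (f(W^{(i)}) - f(W^{(i)} - X_j)) I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \\
&= \sum_{j \ne i} E \Big\{ \sum_{j'=1}^k (-X_j \cdot h_{jj'}) (f(W^{(i)} - X_j) \cdot h_{jj'} - f(W^{(i)}) \cdot h_{jj'}) \Big\} \\
&\qquad \times I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma})
\end{aligned}
\]

where we used the orthonormal basis $\{h_{j1}, \dots, h_{jk}\}$ for each
$j \ne i$ defined as follows. For each
$W^{(i)} = w^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}$ and
$X_j = x_j$, define an orthonormal basis $\{h_{j1}, \dots, h_{jk}\}$ such that
$h_{j1}$ and $w^{(i)} - w_0^{(i)}$ are parallel and $h_{j2}$ and
$-x_j - (-x_j \cdot h_{j1}) h_{j1}$ are parallel (the 0-vector is parallel to
any vector). Recall that $w_0^{(i)}$ is the nearest point in $\bar{A}$ from
$w^{(i)}$. Then,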

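\[
\begin{aligned}
\text{RHS of (2.13)}
\ge \sum_{j \ne i} E \Big\{
& (-X_j \cdot h_{j1}) \big( f(W^{(i)} + (-X_j \cdot h_{j1}) h_{j1}) \cdot h_{j1} - f(W^{(i)}) \cdot h_{j1} \big) \\
&+ (-X_j \cdot h_{j1}) \big( f(W^{(i)} - X_j) \cdot h_{j1} - f(W^{(i)} + (-X_j \cdot h_{j1}) h_{j1}) \cdot h_{j1} \big) \\
&+ (-X_j \cdot h_{j2}) \big( f(W^{(i)} + (-X_j \cdot h_{j1}) h_{j1}) \cdot h_{j2} - f(W^{(i)}) \cdot h_{j2} \big) \\
&+ (-X_j \cdot h_{j2}) \big( f(W^{(i)} - X_j) \cdot h_{j2} - f(W^{(i)} + (-X_j \cdot h_{j1}) h_{j1}) \cdot h_{j2} \big) \Big\} \\
&\times I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}).
\end{aligned}
\]

If $w^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}$ and
$|x_j| \le 4\gamma$, then we have

\[
f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) \cdot h_{j1} - f(w^{(i)}) \cdot h_{j1} = -x_j \cdot h_{j1}, \tag{2.14}
\]

\[
f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) \cdot h_{j2} - f(w^{(i)}) \cdot h_{j2} = 0 \tag{2.15}
\]

and

\[
\begin{aligned}
&(-x_j \cdot h_{j2}) \big( f(w^{(i)} - x_j) \cdot h_{j2} - f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) \cdot h_{j2} \big) \\
&\qquad \ge \big( f(w^{(i)} - x_j) \cdot h_{j1} - f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) \cdot h_{j1} \big)^2.
\end{aligned} \tag{2.16}
\]

Equations (2.14) and (2.15) follow from
$f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) = f(w^{(i)}) + (-x_j \cdot h_{j1}) h_{j1}$.
For (2.16), consider the plane $p$ parallel to $h_{j1}, h_{j2}$ and containing
$w^{(i)}$. Let $l$ be the line parallel to $h_{j2}$ and containing
$w_0^{(i)}$. The line $l$ divides $p$ into two parts $p_1, p_2$, where $p_1$ is
closed and $p_2$ is open and contains $w^{(i)}$. Draw a circle on $p$ with
diameter $[w_0^{(i)}, w^{(i)} - x_j]$. Then $(w^{(i)} - x_j)_0'$, the
projection of $(w^{(i)} - x_j)_0$ on $p$, must be inside the circle (or on the
perimeter) and on $p_1$ because of the convexity of $A$. Let
$(w^{(i)} - x_j)''$ be the projection of $w^{(i)} - x_j$ on $l$, and let
$(w^{(i)} - x_j)'''$ be the projection of $(w^{(i)} - x_j)_0'$ on $l$. Then, (2.16)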
follows from 

which is a consequence of the fact that the angle between
$(w^{(i)} - x_j)'' - (w^{(i)} - x_j)_0'$ and $w_0^{(i)} - (w^{(i)} - x_j)_0'$
is greater than or equal to $\pi/2$. Using $ab \ge -a^2 - b^2/4$,

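\[
\begin{aligned}
&(-x_j \cdot h_{j1}) \big( f(w^{(i)} - x_j) \cdot h_{j1} - f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) \cdot h_{j1} \big) \\
&\qquad \ge -\frac{(-x_j \cdot h_{j1})^2}{4} - \big( f(w^{(i)} - x_j) \cdot h_{j1} - f(w^{(i)} + (-x_j \cdot h_{j1}) h_{j1}) \cdot h_{j1} \big)^2.
\end{aligned} \tag{2.17}
\]

Applying (2.14)-(2.17), we obtain a lower bound of RHS of (2.13) as

\[
\text{RHS of (2.13)} \ge \frac{3}{4} \sum_{j \ne i} E (-X_j \cdot h_{j1})^2 I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}). \tag{2.18}
\]

In other words, we have

\[
\text{RHS of (2.13)} \ge \frac{3}{4} \sum_{j \ne i} E (X_j \cdot \xi(W^{(i)}))^2 I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) =: R \tag{2.19}
\]

where $\xi(W^{(i)}) = (W^{(i)} - W_0^{(i)}) / |W^{(i)} - W_0^{(i)}|$ for
$W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}$ and $W_0^{(i)}$ is the
nearest point in $\bar{A}$ from $W^{(i)}$. We may define $\xi(W^{(i)})$ to be
$e_1$, where $\{e_1, \dots, e_k\}$ is the original orthonormal basis, when
$W^{(i)} \notin A^{\epsilon+4\gamma} \setminus A^{4\gamma}$, since it does not
affect the value of $R$. We now obtain a lower bound of $R$.

\[
\begin{aligned}
R &= \frac{3}{4} \sum_{j \ne i} \sum_{j'=1}^k E X_{jj'}^2 \, \xi(W^{(i)})_{j'}^2 \, I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \\
&\quad + \frac{3}{4} \sum_{j \ne i} \sum_{j' \ne j''} E X_{jj'} X_{jj''} \, \xi(W^{(i)})_{j'} \xi(W^{(i)})_{j''} \, I(|X_j| \le 4\gamma) I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \\
&= R_1 + R_2.
\end{aligned}
\]

For $R_1$,

\[
\begin{aligned}
R_1 &= \frac{3}{4} \sum_{j'=1}^k E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'}^2 \Big[ \sum_{j \ne i} X_{jj'}^2 I(|X_j| \le 4\gamma) - E \sum_{j \ne i} X_{jj'}^2 I(|X_j| \le 4\gamma) \Big] \\
&\quad + \frac{3}{4} \sum_{j'=1}^k E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'}^2 \, E \sum_{j \ne i} X_{jj'}^2 I(|X_j| \le 4\gamma) \\
&= R_{1,1} + R_{1,2}.
\end{aligned}
\]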
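Using the inequality

\[
ab \le \gamma a^2 + \frac{b^2}{4\gamma}, \tag{2.20}
\]

\[
\begin{aligned}
|R_{1,1}| &\le \frac{3}{4} \sum_{j'=1}^k \Big\{ \gamma E \xi(W^{(i)})_{j'}^4 + \frac{1}{4\gamma} E \Big[ \sum_{j \ne i} X_{jj'}^2 I(|X_j| \le 4\gamma) - E \sum_{j \ne i} X_{jj'}^2 I(|X_j| \le 4\gamma) \Big]^2 \Big\} \\
&= \frac{3}{4} \Big\{ \gamma \sum_{j'=1}^k E \xi(W^{(i)})_{j'}^4 + \frac{1}{4\gamma} \sum_{j'=1}^k \mathrm{Var}\Big( \sum_{j \ne i} X_{jj'}^2 I(|X_j| \le 4\gamma) \Big) \Big\} \\
&\le \frac{3}{4} \Big\{ \gamma \sum_{j'=1}^k E \xi(W^{(i)})_{j'}^4 + \frac{1}{4\gamma} \sum_{j'=1}^k \sum_{j \ne i} E X_{jj'}^4 I(|X_j| \le 4\gamma) \Big\}.
\end{aligned}
\]

For $R_{1,2}$,

\[
\begin{aligned}
R_{1,2} &= \frac{3}{4} \sum_{j'=1}^k E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'}^2 \Big[ E \sum_{j \ne i} X_{jj'}^2 - E \sum_{j \ne i} X_{jj'}^2 I(|X_j| > 4\gamma) \Big] \\
&\ge \frac{3}{4} \sum_{j'=1}^k E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'}^2 \, (1 - E X_{ij'}^2) \\
&\quad - \frac{3}{4} \sum_{j'=1}^k E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'}^2 \sum_{j \ne i} E \frac{|X_j|^3}{4\gamma} \\
&\ge (1 - \gamma^{2/3}) \frac{3}{4} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) - \frac{3}{16} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma})
\end{aligned}
\]

where we used the facts that $E|X_i|^2 \le \gamma^{2/3}$ and
$\sum_{j'=1}^k \xi(W^{(i)})_{j'}^2 = 1$ in the last inequality.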

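For $R_2$,

\[
\begin{aligned}
R_2 &= \frac{3}{4} \sum_{j' \ne j''} E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'} \xi(W^{(i)})_{j''} \\
&\qquad \times \Big( \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) - E \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) \Big) \\
&\quad + \frac{3}{4} \sum_{j' \ne j''} E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'} \xi(W^{(i)})_{j''} \, E \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) \\
&= R_{2,1} + R_{2,2}.
\end{aligned}
\]

For $R_{2,1}$, using the inequality (2.20),

\[
\begin{aligned}
|R_{2,1}| &\le \frac{3}{4} \sum_{j' \ne j''} E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, |\xi(W^{(i)})_{j'} \xi(W^{(i)})_{j''}| \\
&\qquad \times \Big| \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) - E \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) \Big| \\
&\le \frac{3}{4} \sum_{j' \ne j''} \Big\{ \gamma E[\xi(W^{(i)})_{j'}^2 \xi(W^{(i)})_{j''}^2] + \frac{1}{4\gamma} E \Big( \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) - E \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| \le 4\gamma) \Big)^2 \Big\} \\
&\le \frac{3\gamma}{4} \sum_{j' \ne j''} E \xi(W^{(i)})_{j'}^2 \xi(W^{(i)})_{j''}^2 + \frac{3}{4} \times \frac{1}{4\gamma} \sum_{j \ne i} \sum_{j' \ne j''} E X_{jj'}^2 X_{jj''}^2 I(|X_j| \le 4\gamma).
\end{aligned}
\]

From the bounds on $|R_{1,1}|$ and $|R_{2,1}|$,

\[
|R_{1,1}| + |R_{2,1}| \le \frac{3\gamma}{4} + \frac{3}{16\gamma} \sum_{j \ne i} E|X_j|^4 I(|X_j| \le 4\gamma) \le \frac{3\gamma}{2}.
\]

A lower bound of $R_{2,2}$ can be obtained as follows. Let $\tilde{W}^{(i)}$ be
an independent copy of $W^{(i)}$.

\[
\begin{aligned}
R_{2,2} &= \frac{3}{4} \sum_{j' \ne j''} E I(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, \xi(W^{(i)})_{j'} \xi(W^{(i)})_{j''} \\
&\qquad \times \Big[ E \sum_{j \ne i} X_{jj'} X_{jj''} - E \sum_{j \ne i} X_{jj'} X_{jj''} I(|X_j| > 4\gamma) \Big] \\
&\ge -\frac{3}{4} E|X_i|^2 P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \\
&\quad - \frac{3}{4} E I(\tilde{W}^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \sum_{j' \ne j''} \sum_{j \ne i} \xi(\tilde{W}^{(i)})_{j'} \xi(\tilde{W}^{(i)})_{j''} X_{jj'} X_{jj''} I(|X_j| > 4\gamma) \\
&\ge -\frac{3}{4} \gamma^{2/3} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \\
&\quad - \frac{3}{4} \sum_{j \ne i} \sum_{j' \ne j''} E I(\tilde{W}^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \, |X_{jj'} \xi(\tilde{W}^{(i)})_{j'}| \, |X_{jj''} \xi(\tilde{W}^{(i)})_{j''}| \, I(|X_j| > 4\gamma) \\
&\ge -\frac{3}{4} \gamma^{2/3} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) - \frac{3}{4} E I(\tilde{W}^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \sum_{j \ne i} |X_j|^2 I(|X_j| > 4\gamma) \\
&\ge -\frac{3}{4} \gamma^{2/3} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) - \frac{3}{16} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma})
\end{aligned}
\]

where we used the facts that
$\sum_{j \ne i} E X_{jj'} X_{jj''} = -E X_{ij'} X_{ij''}$ for $j' \ne j''$ and
$\sum_{j'=1}^k |\xi_{j'}| |x_{j'}| \le |x|$. Therefore,

\[
\begin{aligned}
\text{RHS of (2.13)}
&\ge (1 - \gamma^{2/3}) \frac{3}{4} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) - \frac{3}{16} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) - \frac{3\gamma}{2} \\
&\quad - \frac{3}{4} \gamma^{2/3} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) - \frac{3}{16} P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}).
\end{aligned}
\]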
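Recalling that LHS of (2.13) $\le k^{1/2}(\epsilon + 8\gamma)$, we have

\[
\Big( \frac{3}{8} - \frac{3}{2} \gamma^{2/3} \Big) P(W^{(i)} \in A^{\epsilon+4\gamma} \setminus A^{4\gamma}) \le k^{1/2}\epsilon + 8k^{1/2}\gamma + \frac{3}{2}\gamma. \tag{2.21}
\]

When $\gamma \ge 1/39$, (2.11) is trivially true because its right-hand side
is at least 1. When $\gamma < 1/39$, (2.11) is obtained by solving (2.21).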
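The constants 4.1 and 39 in (2.11) can be checked numerically. The sketch below assumes that (2.21) reads $(3/8 - \tfrac{3}{2}\gamma^{2/3}) P \le k^{1/2}\epsilon + 8k^{1/2}\gamma + \tfrac{3}{2}\gamma$ (the fractions are this edition's reading of the display and should be treated as an assumption), and uses $\tfrac{3}{2}\gamma \le \tfrac{3}{2}k^{1/2}\gamma$. The function name is hypothetical.

```python
def constants_from_2_21(gamma_max=1.0 / 39.0):
    """Coefficients of k^{1/2} eps and k^{1/2} gamma obtained by dividing the
    assumed form of (2.21) by (3/8 - (3/2) gamma^{2/3}) at the worst case
    gamma = gamma_max."""
    c = 3.0 / 8.0 - 1.5 * gamma_max ** (2.0 / 3.0)
    eps_coeff = 1.0 / c            # multiplies k^{1/2} * eps
    gamma_coeff = (8.0 + 1.5) / c  # multiplies k^{1/2} * gamma
    return eps_coeff, gamma_coeff
```

Since $3/8 - \tfrac{3}{2}\gamma^{2/3}$ is decreasing in $\gamma$, the values at $\gamma = 1/39$ are the largest over the range $\gamma < 1/39$; they come out slightly below 4.1 and 39 respectively.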

To prove (2.12), let $f^{X_i} = f(A, |X_i| + 8\gamma)$ be defined as at the
beginning of this section. Consider the following Stein identity,

\[
E W^{(i)} \cdot f^{X_i}(W) = \sum_{j \ne i} E X_j \cdot (f^{X_i}(W) - f^{X_i}(W - X_j)). \tag{2.22}
\]

We have

\[
E |W^{(i)}| (|X_i| + 8\gamma) \ge \sum_{j \ne i} E X_j \cdot (f^{X_i}(W) - f^{X_i}(W - X_j)) I(W \in A^{4\gamma+|X_i|} \setminus A^{4\gamma}) I(|X_j| \le 4\gamma).
\]

The bound (2.12) can be proved by applying the same argument leading to
(2.11). □



3. Multivariate normal approximation 

In this section, we prove a multivariate normal approximation result (Theorem
3.5) by applying the concentration inequality approach in Stein's method. A
multivariate version of the Stein equation was given in Götze (1991) as well as
in Barbour (1990) as follows.

\[
\Delta f(w) - w \cdot \nabla f(w) = h(w) - Eh(Z) \tag{3.1}
\]

where $h$ is a test function and $Z$ is a standard $k$-dimensional Gaussian
random vector.

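If the test function $h$ is smooth enough, the above equation can be solved
and one of its solutions can be expressed as

\[
f(w) = -\frac{1}{2} \int_0^1 \frac{1}{1-s} \int_{\mathbb{R}^k} [h(\sqrt{1-s}\,w + \sqrt{s}\,z) - Eh(Z)] \phi(z) \, dz \, ds \tag{3.2}
\]

where $\phi(z)$ is the density function of the $k$-dimensional standard normal
distribution at $z \in \mathbb{R}^k$. When $\nabla h$ is Lipschitz, the second
derivatives of $f$ can be calculated as

\[
\begin{aligned}
\partial_{jj'} f(w) &= -\frac{1}{2} \int_{\epsilon^2}^1 \frac{1}{s} \int_{\mathbb{R}^k} h(\sqrt{1-s}\,w + \sqrt{s}\,z) \, \partial_{jj'}\phi(z) \, dz \, ds \\
&\quad + \frac{1}{2} \int_0^{\epsilon^2} \frac{1}{\sqrt{s}} \int_{\mathbb{R}^k} \partial_{j'} h(\sqrt{1-s}\,w + \sqrt{s}\,z) \, \partial_j \phi(z) \, dz \, ds.
\end{aligned} \tag{3.3}
\]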

For each test function $h = I_A$, where $A$ is a convex set in
$\mathbb{R}^k$, a smoothed version of it was introduced by Bentkus (2003):

\[
h_\epsilon(w) = \psi\Big( \frac{d(w, A)}{\epsilon} \Big) \tag{3.4}
\]

where $\epsilon > 0$ and the function $\psi$ is defined as

\[
\psi(x) =
\begin{cases}
1, & x < 0, \\
1 - 2x^2, & 0 \le x < \tfrac{1}{2}, \\
2(1 - x)^2, & \tfrac{1}{2} \le x < 1, \\
0, & 1 \le x.
\end{cases} \tag{3.5}
\]
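To make the smoothing concrete, here is a minimal sketch of the piecewise function $\psi$ and of a smoothed indicator built from it. It assumes, as a reading of (3.4), that $h_\epsilon(w) = \psi(d(w, A)/\epsilon)$, and it takes $A$ to be the closed unit ball purely as an example; `dist_to_unit_ball` and `h_eps` are hypothetical names.

```python
import math


def psi(x):
    """Piecewise quadratic cut-off: 1 for x < 0, 0 for x >= 1, C^1 in between."""
    if x < 0.0:
        return 1.0
    if x < 0.5:
        return 1.0 - 2.0 * x * x
    if x < 1.0:
        return 2.0 * (1.0 - x) ** 2
    return 0.0


def dist_to_unit_ball(w):
    """Euclidean distance from w to the closed unit ball (assumed example set A)."""
    return max(math.sqrt(sum(v * v for v in w)) - 1.0, 0.0)


def h_eps(w, eps):
    """Smoothed indicator of A, assuming h_eps(w) = psi(d(w, A) / eps)."""
    return psi(dist_to_unit_ball(w) / eps)
```

The two quadratic pieces meet at $x = 1/2$ with common value $1/2$ and common slope $-2$, so $|\psi'| \le 2$; because the distance function is 1-Lipschitz, this is where a gradient bound of order $1/\epsilon$ for $h_\epsilon$ comes from.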

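The next lemma was proved in Bentkus (2003).

Lemma 3.1. The function $h_\epsilon$ defined above satisfies:

\[
h_\epsilon(w) = 1 \ \text{for} \ w \in A, \quad h_\epsilon(w) = 0 \ \text{for} \ w \in \mathbb{R}^k \setminus A^\epsilon, \quad 0 \le h_\epsilon \le 1, \tag{3.6}
\]

and

\[
|\nabla h_\epsilon(w)| \le \frac{2}{\epsilon}, \quad |\nabla h_\epsilon(w_1) - \nabla h_\epsilon(w_2)| \le \frac{8 |w_1 - w_2|}{\epsilon^2}. \tag{3.7}
\]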
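For a convex set $A$ and $\gamma > 0$, defining $g_{1,\epsilon} = h_\epsilon$
for $h = I_{A^{4\gamma}}$, we have

\[
\begin{aligned}
P(W \in A) - P(Z \in A) &\le P(W \in A^{4\gamma}) - P(Z \in A) \\
&\le E g_{1,\epsilon}(W) - E g_{1,\epsilon}(Z) + E g_{1,\epsilon}(Z) - P(Z \in A) \\
&\le E g_{1,\epsilon}(W) - E g_{1,\epsilon}(Z) + P(Z \in A^{4\gamma+\epsilon} \setminus A) \\
&\le E g_{1,\epsilon}(W) - E g_{1,\epsilon}(Z) + k^{1/2}(4\gamma + \epsilon)
\end{aligned}
\]

where we used (2.5). If $A^{-\epsilon-4\gamma} = \emptyset$,

\[
P(W \in A) - P(Z \in A) \ge -P(Z \in A \setminus A^{-\epsilon-4\gamma}) \ge -k^{1/2}(4\gamma + \epsilon).
\]

If not, defining $g_{2,\epsilon} = h_\epsilon$ for
$h = I_{(A^{-\epsilon-4\gamma})^{4\gamma}}$, we have

\[
\begin{aligned}
P(W \in A) - P(Z \in A) &\ge E g_{2,\epsilon}(W) - E g_{2,\epsilon}(Z) + E g_{2,\epsilon}(Z) - P(Z \in A) \\
&\ge E g_{2,\epsilon}(W) - E g_{2,\epsilon}(Z) - P(Z \in A \setminus A^{-\epsilon-4\gamma}) \\
&\ge E g_{2,\epsilon}(W) - E g_{2,\epsilon}(Z) - k^{1/2}(4\gamma + \epsilon).
\end{aligned}
\]

Therefore, we have the following smoothing lemma.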

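Lemma 3.2. For any $k$-dimensional random vector $W$,

\[
\sup_{A \in \mathcal{A}} |P(W \in A) - P(Z \in A)| \le \sup_{h = I_{A^{4\gamma}} : A \in \mathcal{A}} |E h_\epsilon(W) - E h_\epsilon(Z)| + k^{1/2}(\epsilon + 4\gamma) \tag{3.8}
\]

where $Z$ is a standard $k$-dimensional Gaussian random vector, $\mathcal{A}$
is the set of all the convex sets in $\mathbb{R}^k$, $\epsilon > 0$,
$\gamma > 0$ and $h_\epsilon$ is defined as in (3.4).

The following lemma from Bentkus (2003) will be used in this section.

Lemma 3.3. For a $k$-dimensional vector $x$,

\[
\int_{\mathbb{R}^k} \Big| \sum_{j=1}^k x_j \partial_j \phi(z) \Big| \, dz \le \sqrt{\frac{2}{\pi}} \, |x|, \tag{3.9}
\]

\[
\int_{\mathbb{R}^k} \Big| \sum_{j,j',j''=1}^k x_j x_{j'} x_{j''} \partial_{jj'j''} \phi(z) \Big| \, dz \le \frac{2(1 + 4e^{-3/2})}{\sqrt{2\pi}} \, |x|^3. \tag{3.10}
\]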
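By rotation invariance of $\phi$, both integrals in Lemma 3.3 reduce to one-dimensional Gaussian expectations: the first equals $|x| \, E|Z_1|$ and the second equals $|x|^3 E|Z_1^3 - 3Z_1|$. The following sketch checks the resulting constants by midpoint-rule quadrature. The closed forms $\sqrt{2/\pi}$ and $2(1 + 4e^{-3/2})/\sqrt{2\pi}$ used for comparison are this edition's reading of (3.9) and (3.10) and should be treated as an assumption; the helper names are hypothetical.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def gaussian_expectation(g, lim=12.0, n=240_000):
    """E[g(Z)] for Z ~ N(0, 1), by the midpoint rule on [-lim, lim]."""
    h = 2.0 * lim / n
    total = 0.0
    for m in range(n):
        z = -lim + (m + 0.5) * h
        total += g(z) * phi(z)
    return total * h


# One-dimensional reductions of the two integrals (taking |x| = 1).
first_order = gaussian_expectation(abs)                              # E|Z|
third_order = gaussian_expectation(lambda z: abs(z ** 3 - 3.0 * z))  # E|Z^3 - 3Z|

first_closed_form = math.sqrt(2.0 / math.pi)
third_closed_form = 2.0 * (1.0 + 4.0 * math.exp(-1.5)) / math.sqrt(2.0 * math.pi)
```

Under this reading both bounds hold with equality, which explains why the constant $1 + 4e^{-3/2}$ reappears in the later estimates.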

Using the same argument as in Bentkus (2003) when proving Lemma 3.3, we
obtain the following lemma.

Lemma 3.4. For $k$-dimensional vectors $u, v$, we have

\[
\int_{\mathbb{R}^k} \Big| \sum_{j,j',j''=1}^k u_j v_{j'} v_{j''} \partial_{jj'j''} \phi(z) \Big| \, dz \le 2\Big( 1 + \sqrt{\frac{2}{\pi}} \Big) |u| |v|^2. \tag{3.11}
\]

Proof. It is straightforward to verify that

\[
\sum_{j,j',j''=1}^k u_j v_{j'} v_{j''} \partial_{jj'j''} \phi(z) = \big( |v|^2 (u \cdot z) + 2 (u \cdot v)(v \cdot z) - (u \cdot z)(v \cdot z)^2 \big) \phi(z). \tag{3.12}
\]

From (3.12), we only need to consider the projection of $z$ in the
two-dimensional space spanned by the vectors $u, v$. Therefore, the constant
obtained is dimension free and the rough upper bound (3.11) is calculated as
follows. Let $Z_1, Z_2$ be two independent 1-dimensional standard Gaussian
variables; then

\[
\int_{\mathbb{R}^k} \Big| \sum_{j,j',j''=1}^k u_j v_{j'} v_{j''} \partial_{jj'j''} \phi(z) \Big| \, dz \le \big( E|3Z_1 - Z_1^3| + E|Z_2 (1 - Z_1^2)| \big) |u| |v|^2 \le 2\Big( 1 + \sqrt{\frac{2}{\pi}} \Big) |u| |v|^2.
\]

□



T 



T 



Theorem 3.5. Let k-dimensional random vector W be 

n n 

W = (W U ..., W k ) T = Y,Xi = Y^(Xa,X i2 , . . . , X ik ) 

i=l i=l 

where {Xi : i £ [n]} are independent such that EXi = for each i and EWW 
hxk- Then, 

sup \¥{W e A) - F(Z e A)\ < 115fc 1/2 7 (3.13) 
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where A is the set of all the convex sets in R fc , Z is a standard k- dimensional 
Gaussian vector and 7 = X)"=i Tf = SlLi E|Xi| 3 . 

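Proof. Without loss of generality, assume $\gamma$ is finite. Let
$f_\epsilon$ be the solution to the Stein equation (3.1) with test function
$h_\epsilon$ defined in (3.4), where $h = I_{A^{4\gamma}}$ for some
$A \in \mathcal{A}$. With $W^{(i)} = W - X_i$, we have

\[
\begin{aligned}
& E \Delta f_\epsilon(W) - E W \cdot \nabla f_\epsilon(W) \\
&= E \Delta f_\epsilon(W) - \sum_{i=1}^n E X_i \cdot (\nabla f_\epsilon(W) - \nabla f_\epsilon(W^{(i)})) \\
&= E \Delta f_\epsilon(W) - \sum_{i=1}^n E X_i \cdot (\mathrm{Hess} f_\epsilon(W^{(i)}) X_i) \\
&\quad - \sum_{i=1}^n E X_i \cdot (\nabla f_\epsilon(W) - \nabla f_\epsilon(W^{(i)}) - \mathrm{Hess} f_\epsilon(W^{(i)}) X_i) \\
&= R_1 - R_2
\end{aligned} \tag{3.14}
\]

where

\[
R_1 = \sum_{i=1}^n \sum_{j,j'=1}^k E X_{ij} X_{ij'} \, E[\partial_{jj'} f_\epsilon(W) - \partial_{jj'} f_\epsilon(W^{(i)})] \tag{3.15}
\]

and

\[
R_2 = \sum_{i=1}^n \sum_{j,j'=1}^k E X_{ij} X_{ij'} [\partial_{jj'} f_\epsilon(W^{(i)} + U X_i) - \partial_{jj'} f_\epsilon(W^{(i)})] \tag{3.16}
\]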

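where $U$ is an independent uniform random variable in $[0, 1]$. From (3.3),
$R_2$ can be expressed as

\[
\begin{aligned}
R_2 &= \sum_{i=1}^n \sum_{j,j'=1}^k E X_{ij} X_{ij'} \int_{\epsilon^2}^1 \Big( -\frac{1}{2s} \Big) \int_{\mathbb{R}^k} [h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{1-s}\,U X_i + \sqrt{s}\,z) \\
&\qquad\qquad - h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{s}\,z)] \, \partial_{jj'} \phi(z) \, dz \, ds \\
&\quad + \sum_{i=1}^n \sum_{j,j'=1}^k E X_{ij} X_{ij'} \int_0^{\epsilon^2} \frac{1}{2\sqrt{s}} \int_{\mathbb{R}^k} [\partial_{j'} h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{1-s}\,U X_i + \sqrt{s}\,z) \\
&\qquad\qquad - \partial_{j'} h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{s}\,z)] \, \partial_j \phi(z) \, dz \, ds \\
&= R_{2,1} + R_{2,2}.
\end{aligned}
\]

Introducing another independent uniform random variable $U'$ in $[0, 1]$ and
using the integration by parts formula,

\[
\begin{aligned}
R_{2,1} &= \sum_{i=1}^n \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \\
&\qquad \times \int_{\mathbb{R}^k} h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{1-s}\,U U' X_i + \sqrt{s}\,z) \, \partial_{jj'j''} \phi(z) \, dz \, ds
\end{aligned}
\]

and

\[
\begin{aligned}
R_{2,2} &= \sum_{i=1}^n \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_0^{\epsilon^2} \frac{\sqrt{1-s}}{2\sqrt{s}} \\
&\qquad \times \int_{\mathbb{R}^k} \partial_{j'j''} h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{1-s}\,U U' X_i + \sqrt{s}\,z) \, \partial_j \phi(z) \, dz \, ds \\
&= \sum_{i=1}^n \sum_{j=1}^k E \int_0^{\epsilon^2} \frac{\sqrt{1-s}}{2\sqrt{s}} \int_{\mathbb{R}^k} U X_{ij} \Big( \sum_{j'=1}^k X_{ij'} \partial_{j'} \nabla h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{1-s}\,U U' X_i + \sqrt{s}\,z) \cdot X_i \Big) \partial_j \phi(z) \, dz \, ds.
\end{aligned}
\]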



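We first use the concentration inequality in Proposition 2.7 to bound
$R_{2,2}$. Define any linear transform of a set to be the image of the linear
transform of all the elements in the set. Notice that by (3.7) and
Proposition 2.7,

\[
\begin{aligned}
& \Big| E^{U,U',X_i} \Big( \sum_{j'=1}^k X_{ij'} \partial_{j'} \nabla h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \cdot X_i \Big) \Big| \\
&\le \frac{8}{\epsilon^2} |X_i|^2 E^{U,U',X_i} I\big( \sqrt{1-s}\,W^{(i)} \in A^{4\gamma+\epsilon} \setminus A^{4\gamma} - (\sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \big) \\
&\le |X_i|^2 \Big( 32.8 k^{1/2} \frac{1}{\epsilon \sqrt{1-s}} + 312 k^{1/2} \frac{\gamma}{\epsilon^2} \Big).
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
|R_{2,2}| &\le \sum_{i=1}^n E \, U |X_i|^2 \int_0^{\epsilon^2} \frac{\sqrt{1-s}}{2\sqrt{s}} \Big( 32.8 k^{1/2} \frac{1}{\epsilon \sqrt{1-s}} + 312 k^{1/2} \frac{\gamma}{\epsilon^2} \Big) \\
&\qquad \times \int_{\mathbb{R}^k} \Big| \sum_{j=1}^k X_{ij} \partial_j \phi(z) \Big| \, dz \, ds \\
&\le \sqrt{\frac{2}{\pi}} \, \gamma \Big( 16.4 k^{1/2} + 156 k^{1/2} \frac{\gamma}{\epsilon} \Big)
\end{aligned} \tag{3.17}
\]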

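where we used Lemma 3.3. Next, we make use of the concentration inequality in
Proposition 2.7 to bound $R_{2,1}$ by a quantity involving $\gamma$,
$\epsilon$ and $\sup_{A \in \mathcal{A}} |P(W \in A) - P(Z \in A)|$. Write
$R_{2,1} = R'_{2,1} + R''_{2,1}$ by separating the sum over $i$ into two parts
according to whether $\gamma_i < 8\gamma^3$ or not. Write
$R'_{2,1} = R'_{2,1,1} + R'_{2,1,2}$ by subtracting a term with $W^{(i)}$
replaced by an independent $k$-dimensional standard Gaussian vector $Z$ and
adding the same term, i.e.,

\[
\begin{aligned}
R'_{2,1,1} &= \sum_{i : \gamma_i < 8\gamma^3} \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \\
&\qquad \times \int_{\mathbb{R}^k} [h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \\
&\qquad\qquad - h_\epsilon(\sqrt{1-s}\,Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i)] \, \partial_{jj'j''} \phi(z) \, dz \, ds
\end{aligned}
\]

and

\[
\begin{aligned}
R'_{2,1,2} &= \sum_{i : \gamma_i < 8\gamma^3} \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \\
&\qquad \times \int_{\mathbb{R}^k} h_\epsilon(\sqrt{1-s}\,Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \, \partial_{jj'j''} \phi(z) \, dz \, ds.
\end{aligned}
\]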



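By introducing an independent copy $\tilde{X}_i$ of $X_i$,
$\tilde{W} = W^{(i)} + \tilde{X}_i$ has the same distribution as $W$ and is
independent of $X_i$. We have

\[
\begin{aligned}
& E^{U,U',X_i} \big[ h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) - h_\epsilon(\sqrt{1-s}\,Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \big] \\
&\le E^{U,U',X_i} \Big[ I\Big( W^{(i)} \in \tfrac{1}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \\
&\qquad\qquad - I\Big( Z \in \tfrac{1}{\sqrt{1-s}} (A^{4\gamma} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \Big] \\
&\le E^{U,U',X_i} \Big[ I\Big( W^{(i)} + \tilde{X}_i \in \Big( \tfrac{1}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big)^{|\tilde{X}_i|} \\
&\qquad\qquad\qquad \setminus \tfrac{1}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \\
&\qquad + I\Big( Z \in \tfrac{1}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \\
&\qquad\qquad\qquad \setminus \tfrac{1}{\sqrt{1-s}} (A^{4\gamma} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \\
&\qquad + I\Big( \tilde{W} \in \tfrac{1}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \\
&\qquad - I\Big( Z \in \tfrac{1}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \Big].
\end{aligned}
\]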

Let $\delta_\gamma$ denote the supremum of
$\sup_{A \in \mathcal{A}} |P(W \in A) - P(Z \in A)|$ over all $W$ such that $W$
can be expressed as a sum of $n$ independent mean-zero random vectors such
that $\mathrm{Cov}(W, W) = I_{k\times k}$ and the sum of absolute third moments
of the summands is bounded by $\gamma$. Using the concentration inequalities in
Proposition 2.5 and Proposition 2.7, we have

\[
\begin{aligned}
& E^{U,U',X_i} \big[ h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) - h_\epsilon(\sqrt{1-s}\,Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \big] \\
&\le 4.1 k^{1/2} E|X_i| + 39 k^{1/2} \gamma + k^{1/2} \frac{\epsilon}{\sqrt{1-s}} + \delta_\gamma.
\end{aligned} \tag{3.18}
\]

After proving a lower bound in the same way as the upper bound, we can use
Lemma 3.3 to bound $R'_{2,1,1}$ by

\[
|R'_{2,1,1}| \le \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} \Big( 47.2 k^{1/2} \frac{\gamma}{\epsilon} + k^{1/2} + \frac{\delta_\gamma}{\epsilon} \Big) \sum_{i : \gamma_i < 8\gamma^3} E|X_i|^3.
\]


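For $R'_{2,1,2}$, using the integration by parts formula and noticing that
$\sqrt{1-s}\,Z + \sqrt{s}\,\tilde{Z}$ has the same distribution as $Z$, where
$\tilde{Z}$ is an independent copy of the standard normal vector $Z$,

\[
\begin{aligned}
& E^{X_i} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \int_{\mathbb{R}^k} h_\epsilon(\sqrt{1-s}\,Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \, \partial_{jj'j''} \phi(z) \, dz \, ds \\
&= -E^{X_i} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2} \int_{\mathbb{R}^k} \partial_{jj'j''} h_\epsilon(\sqrt{1-s}\,Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \, \phi(z) \, dz \, ds \\
&= E^{X_i} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2} \int_{\mathbb{R}^k} h_\epsilon(z + \sqrt{1-s}\,U U' X_i) \, \partial_{jj'j''} \phi(z) \, dz \, ds.
\end{aligned}
\]

Therefore, by Lemma 3.3,

\[
|R'_{2,1,2}| \le \frac{1 + 4e^{-3/2}}{3\sqrt{2\pi}} \sum_{i : \gamma_i < 8\gamma^3} E|X_i|^3. \tag{3.19}
\]

We remark that in the above calculation we used the third derivatives of
$h_\epsilon$, which do not exist. However, we can smooth $h_\epsilon$ first and
then use a limiting argument to show that the final equality holds even if
$h_\epsilon$ does not have third derivatives.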
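Now we turn to bounding $|R''_{2,1}|$, where

\[
\begin{aligned}
R''_{2,1} &= \sum_{i : \gamma_i \ge 8\gamma^3} \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \\
&\qquad \times \int_{\mathbb{R}^k} h_\epsilon(\sqrt{1-s}\,W^{(i)} + \sqrt{1-s}\,U U' X_i + \sqrt{s}\,z) \, \partial_{jj'j''} \phi(z) \, dz \, ds.
\end{aligned}
\]

For each $X_i$ such that $\gamma_i \ge 8\gamma^3$, define $N_i$ to be the
positive square root of the inverse of the matrix
$I_{k\times k} - \mathrm{Cov}(X_i, X_i)$. Then we have the following bound on
the operator norm of $N_i$.

\[
\|N_i\| = \Big( \frac{1}{1 - \sup_{|u|=1} u^T \mathrm{Cov}(X_i, X_i) u} \Big)^{1/2} = \Big( \frac{1}{1 - \sup_{|u|=1} E(u^T X_i)^2} \Big)^{1/2} \le \Big( \frac{1}{1 - \gamma_i^{2/3}} \Big)^{1/2}. \tag{3.20}
\]

Note that

\[
N_i W^{(i)} = \sum_{j \ne i} N_i X_j \tag{3.21}
\]

is a sum of $n$ independent random vectors (with one 0-vector) with

\[
E N_i X_j = 0, \quad \mathrm{Cov}(N_i W^{(i)}, N_i W^{(i)}) = I_{k\times k} \tag{3.22}
\]

and

\[
\sum_{j \ne i} E|N_i X_j|^3 \le \frac{\gamma - \gamma_i}{(1 - \gamma_i^{2/3})^{3/2}} \le \frac{\gamma - \gamma_i}{1 - 2\gamma_i^{2/3}} \le \frac{\gamma - \gamma_i}{1 - \gamma_i/\gamma} = \gamma \tag{3.23}
\]

where we used the fact that $\gamma_i \ge 8\gamma^3$ in the last inequality.
Therefore, $N_i W^{(i)}$ can be regarded as a standardized sum of $n$
independent random vectors with sum of absolute third moments of the summands
at most $\gamma$. We write $R''_{2,1}$ into two parts as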

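\[
\begin{aligned}
R''_{2,1,1} &= \sum_{i : \gamma_i \ge 8\gamma^3} \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \\
&\qquad \times \int_{\mathbb{R}^k} [h_\epsilon(\sqrt{1-s}\,N_i^{-1}(N_i W^{(i)}) + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \\
&\qquad\qquad - h_\epsilon(\sqrt{1-s}\,N_i^{-1} Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i)] \, \partial_{jj'j''} \phi(z) \, dz \, ds
\end{aligned}
\]

and

\[
\begin{aligned}
R''_{2,1,2} &= \sum_{i : \gamma_i \ge 8\gamma^3} \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2 s^{3/2}} \\
&\qquad \times \int_{\mathbb{R}^k} h_\epsilon(\sqrt{1-s}\,N_i^{-1} Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \, \partial_{jj'j''} \phi(z) \, dz \, ds.
\end{aligned}
\]

From

\[
\begin{aligned}
& E^{U,U',X_i} \big[ h_\epsilon(\sqrt{1-s}\,N_i^{-1}(N_i W^{(i)}) + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \\
&\qquad\qquad - h_\epsilon(\sqrt{1-s}\,N_i^{-1} Z + \sqrt{s}\,z + \sqrt{1-s}\,U U' X_i) \big] \\
&\le E^{U,U',X_i} \Big[ I\Big( N_i W^{(i)} \in \tfrac{N_i}{\sqrt{1-s}} (A^{4\gamma+\epsilon} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \\
&\qquad\qquad - I\Big( Z \in \tfrac{N_i}{\sqrt{1-s}} (A^{4\gamma} - \sqrt{s}\,z - \sqrt{1-s}\,U U' X_i) \Big) \Big] \\
&\le \delta_\gamma + k^{1/2} \|N_i\| \frac{\epsilon}{\sqrt{1-s}}
\end{aligned}
\]

and a similar lower bound, we have

\[
|R''_{2,1,1}| \le \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} \sum_{i : \gamma_i \ge 8\gamma^3} E|X_i|^3 \Big( \frac{\delta_\gamma}{\epsilon} + k^{1/2} \frac{1}{\sqrt{1 - \gamma^{2/3}}} \Big).
\]

Therefore,

\[
|R'_{2,1,1}| + |R''_{2,1,1}| \le \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} \, \gamma \Big( \frac{\delta_\gamma}{\epsilon} + k^{1/2} \frac{1}{\sqrt{1 - \gamma^{2/3}}} + 47.2 k^{1/2} \frac{\gamma}{\epsilon} \Big). \tag{3.24}
\]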

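Using a similar argument leading to (3.19), $R''_{2,1,2}$ can be written as

\[
\begin{aligned}
R''_{2,1,2} &= \sum_{i : \gamma_i \ge 8\gamma^3} \sum_{j,j',j''=1}^k E U X_{ij} X_{ij'} X_{ij''} \int_{\epsilon^2}^1 \frac{\sqrt{1-s}}{2} \\
&\qquad \times \int_{\mathbb{R}^k} h_\epsilon(z + \sqrt{1-s}\,U U' X_i) \, \partial_{jj'j''} \phi_{\Sigma_i^s}(z) \, dz \, ds
\end{aligned} \tag{3.25}
\]

where $\Sigma_i^s = I_{k\times k} - (1-s) \mathrm{Cov}(X_i, X_i)$ and
$\phi_{\Sigma_i^s}$ is the density function of $N(0, \Sigma_i^s)$. From

\[
\int_{\mathbb{R}^k} \Big| \sum_{j,j',j''=1}^k X_{ij} X_{ij'} X_{ij''} \partial_{jj'j''} \phi_{\Sigma_i^s}(z) \Big| \, dz = \int_{\mathbb{R}^k} \Big| \sum_{j,j',j''=1}^k (N_i^s X_i)_j (N_i^s X_i)_{j'} (N_i^s X_i)_{j''} \partial_{jj'j''} \phi(z) \Big| \, dz
\]

where $N_i^s$ is the positive square root of the inverse of $\Sigma_i^s$,

\[
|R''_{2,1,2}| \le \sum_{i : \gamma_i \ge 8\gamma^3} E|X_i|^3 \, \frac{1 + 4e^{-3/2}}{3\sqrt{2\pi}} \Big( \frac{1}{1 - \gamma^{2/3}} \Big)^{3/2}. \tag{3.26}
\]

We used the fact that $\|N_i^s\| \le (\frac{1}{1 - \gamma^{2/3}})^{1/2}$, which
can be proved as in (3.20), in the above inequality. Therefore,

\[
|R'_{2,1,2}| + |R''_{2,1,2}| \le \frac{1 + 4e^{-3/2}}{3\sqrt{2\pi}} \, \gamma \, \frac{1}{(1 - \gamma^{2/3})^{3/2}}. \tag{3.27}
\]

Observing that $R_1$ can be written as

\[
R_1 = \sum_{i=1}^n \sum_{j,j'=1}^k E \tilde{X}_{ij} \tilde{X}_{ij'} [\partial_{jj'} f_\epsilon(W) - \partial_{jj'} f_\epsilon(W^{(i)})]
\]

where $\tilde{X}_i$ is an independent copy of $X_i$, we can bound it similarly
as for $R_2$ as follows.

\[
|R_{1,2}| \le 2\sqrt{\frac{2}{\pi}} \, \gamma \Big( 16.4 k^{1/2} + 156 k^{1/2} \frac{\gamma}{\epsilon} \Big), \tag{3.28}
\]

\[
|R_{1,1,1}| \le 2\Big( 1 + \sqrt{\frac{2}{\pi}} \Big) \gamma \Big( \frac{\delta_\gamma}{\epsilon} + k^{1/2} \frac{1}{\sqrt{1 - \gamma^{2/3}}} + 47.2 k^{1/2} \frac{\gamma}{\epsilon} \Big), \tag{3.29}
\]

\[
|R_{1,1,2}| \le \frac{2}{3} \Big( 1 + \sqrt{\frac{2}{\pi}} \Big) \gamma \, \frac{1}{(1 - \gamma^{2/3})^{3/2}}. \tag{3.30}
\]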

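Note that the constants are different from those of $R_2$ because we use (3.11)
instead of (3.10), and an extra 2 comes from the fact that there is no $U$ in
$R_1$. From the bounds (3.29), (3.30), (3.28), (3.24), (3.27), (3.17) and the
smoothing inequality (3.8), with $c = 2(1 + \sqrt{2/\pi})$,

\[
\begin{aligned}
\Big( 1 - \Big( \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} + c \Big) \frac{\gamma}{\epsilon} \Big) \delta_\gamma
&\le \Big( 49.2 \sqrt{\frac{2}{\pi}} + \Big( \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} + c \Big) \frac{1}{\sqrt{1 - \gamma^{2/3}}} + \Big( \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} + c \Big) \frac{1}{3(1 - \gamma^{2/3})^{3/2}} \Big) k^{1/2} \gamma \\
&\quad + \Big( 468 \sqrt{\frac{2}{\pi}} + 47.2 \, \frac{1 + 4e^{-3/2}}{\sqrt{2\pi}} + 47.2 c \Big) k^{1/2} \frac{\gamma^2}{\epsilon} + k^{1/2}(4\gamma + \epsilon).
\end{aligned}
\]

Let $\epsilon = 33\gamma$, and without loss of generality let
$\gamma \le 1/115$. The bound (3.13) is proved by solving the above inequality.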
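The arithmetic behind the constant 115 can be sketched numerically. The code below assumes a particular reading of the final inequality, namely that with $a = (1 + 4e^{-3/2})/\sqrt{2\pi}$ and $c = 2(1 + \sqrt{2/\pi})$ it has the form $(1 - (a + c)\gamma/\epsilon)\,\delta_\gamma \le (49.2\sqrt{2/\pi} + (a+c)/\sqrt{1-\gamma^{2/3}} + (a+c)/(3(1-\gamma^{2/3})^{3/2}))\,k^{1/2}\gamma + (468\sqrt{2/\pi} + 47.2(a+c))\,k^{1/2}\gamma^2/\epsilon + k^{1/2}(4\gamma + \epsilon)$. This form is an assumption reconstructed from the individual bounds, and `final_constant` is a hypothetical helper.

```python
import math


def final_constant(eps_ratio=33.0, gamma=1.0 / 115.0):
    """Coefficient C such that delta_gamma <= C * k^{1/2} * gamma, obtained by
    setting eps = eps_ratio * gamma in the assumed form of the final inequality
    and evaluating at the given (worst-case) gamma."""
    a = (1.0 + 4.0 * math.exp(-1.5)) / math.sqrt(2.0 * math.pi)
    c = 2.0 * (1.0 + math.sqrt(2.0 / math.pi))
    g23 = gamma ** (2.0 / 3.0)
    lhs_factor = 1.0 - (a + c) / eps_ratio
    first = (49.2 * math.sqrt(2.0 / math.pi)
             + (a + c) / math.sqrt(1.0 - g23)
             + (a + c) / (3.0 * (1.0 - g23) ** 1.5))
    second = (468.0 * math.sqrt(2.0 / math.pi) + 47.2 * (a + c)) / eps_ratio
    third = 4.0 + eps_ratio
    return (first + second + third) / lhs_factor
```

With the default arguments the returned value is just below 115. For $\gamma > 1/115$ the bound $115 k^{1/2}\gamma$ exceeds 1 and is trivial, which is why only $\gamma \le 1/115$ needs to be considered.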

□ 



4. Proofs of lemmas 

We prove Lemmas 2.1 to 2.4 in this section.

Proof of Lemma 2.1. The lemma is true by observing that for
$x \in \mathbb{R}^k \setminus A^\epsilon$, $x_0$ must be the nearest point of
$x_1$ in $\bar{A}$, where $x_0, x_1$ are as defined above Lemma 2.1.

Proof of Lemma 2.2. Because $x_0$, the nearest point in $\bar{A}$ from $x$,
depends on $x$, the validity of (2.2) is not obvious. We consider the following
three cases. All the other cases can be reduced to these cases.

Case 1: $\eta \in \bar{A}$, $\eta + \xi \in \bar{A}$.

Case 2: $\eta \in A^\epsilon \setminus \bar{A}$,
$\eta + \xi \in A^\epsilon \setminus \bar{A}$.

Case 3: $\eta \in \mathbb{R}^k \setminus A^\epsilon$,
$\eta + \xi \in \mathbb{R}^k \setminus A^\epsilon$.

In case 1, since $f(\eta) = f(\eta + \xi) = 0$, (2.2) is satisfied.

From the facts that (2.2) is equivalent to

\[
(-\xi) \cdot (f(\eta + \xi + (-\xi)) - f(\eta + \xi)) \ge 0 \tag{4.1}
\]

and

\[
\xi \cdot (\eta - \eta_0) \ge 0 \ \text{implies} \ (-\xi) \cdot ((\eta + \xi) - (\eta + \xi)_0) \le 0, \tag{4.2}
\]

which can be proved using a similar argument as in the next paragraph, we only
need to consider the following situation in case 2.

Assume $\xi \cdot (\eta - \eta_0) \le 0$. Let $p_1$ be the plane containing the
points $\eta_0, \eta, \eta + \xi$. Let the point $(\eta + \xi)'$ be on $p_1$
such that $(\eta + \xi)' - (\eta + \xi)$ is parallel to $\eta_0 - \eta$ and
$(\eta + \xi)' - \eta_0$ is parallel to $\xi$. Let $p_2$ be the
$(k-1)$-dimensional hyperplane orthogonal to $\xi$ and containing
$(\eta + \xi)'$. The hyperplane $p_2$ divides $\mathbb{R}^k$ into two parts
$s_1, s_2$, where $s_1$ is closed and contains $\eta$. If $(\eta + \xi)_0$, the
nearest point in $\bar{A}$ from $\eta + \xi$, is in $s_1$, (2.2) is satisfied.
If not, let $(\eta + \xi)''$ be the projection of $(\eta + \xi)_0$ on $p_1$.
Then the angle between $\eta_0 - (\eta + \xi)''$ and
$\eta + \xi - (\eta + \xi)''$ is less than $\pi/2$. This means that the angle
between $\eta_0 - (\eta + \xi)_0$ and $\eta + \xi - (\eta + \xi)_0$ is less
than $\pi/2$, which contradicts the fact that $(\eta + \xi)_0$ is the nearest
point in $\bar{A}$ from $\eta + \xi$.

The validity of (2.2) in case 3 can be proved similarly.

Proof of Lemma 2.3. We first prove that $f_i$ is 1-Lipschitz in direction $i$.
From (4.2), we only need to prove

\[
|f_i(x + h e_i) - f_i(x)| \le h, \quad h > 0 \tag{4.3}
\]

in the following two cases.

Case 1: $x, x + h e_i \in A^\epsilon \setminus \bar{A}$ and
$e_i \cdot (x - x_0) \le 0$.

Case 2: $x, x + h e_i \notin A^\epsilon$ and $e_i \cdot (x - x_0) \le 0$.

For case 1, let $p_1$ be the plane parallel to $x - x_0$, $e_i$ and containing
$x$. Let $(x + h e_i)'$ be on $p_1$ such that $(x + h e_i)' - (x + h e_i)$ is
parallel to $x - x_0$ and $(x + h e_i)' - x_0$ is parallel to $e_i$. Let $p_2$
be the $(k-1)$-dimensional hyperplane orthogonal to $e_i$ and containing
$(x + h e_i)'$, and let $p_3$ be the $(k-1)$-dimensional hyperplane orthogonal
to $x - x_0$ and containing $x_0$. Let $(x + h e_i)''$ be the projection of
$x + h e_i$ on $p_3$ and let $x'$ be the intersection of the line
$\{x_0 + t(x - x_0) : t \in \mathbb{R}\}$ with $p_2$. Then $(x + h e_i)_0'$, the
projection of $(x + h e_i)_0$ on $p_1$, must be within the trapezoid
$\{x_0, x', (x + h e_i)', (x + h e_i)''\}$ (including the boundary), which
implies $h \ge f_i(x + h e_i) - f_i(x) \ge 0$. Therefore, (4.3) is satisfied.
Case 2 is similar.

Since $f_i$ is 1-Lipschitz in direction $i$, $\partial_i f_i$ exists a.e. From
Lemma 2.2,

\[
\frac{f_i(x + h e_i) - f_i(x)}{h} = \frac{(h e_i) \cdot (f(x + h e_i) - f(x))}{h^2} \ge 0, \quad \forall h \in \mathbb{R}, \ h \ne 0.
\]

Therefore,

\[
\partial_i f_i(x) = \lim_{h \to 0} \frac{f_i(x + h e_i) - f_i(x)}{h} \ge 0 \quad \text{a.e.}
\]

Proof of Lemma 2.4. If $\theta_i = 0$, $f_i(x) = (x - x_0) \cdot e_i$, the
$i$th coordinate of $x - x_0$. Note that $x_0$ does not change by moving $x$ a
little in the direction of $e_i$. So
$\partial_i f_i(x) = 1 = \cos^2\theta_i$. If $\theta_i = \pi/2$, Lemma 2.4
follows from Lemma 2.3.

Suppose $0 < \theta_i < \pi/2$ and $h > 0$ is small enough such that
$x + h e_i \in (A^\epsilon)^\circ \setminus \bar{A}$. Let $p_1$ be the
$(k-1)$-dimensional hyperplane orthogonal to $x - x_0$ which contains $x_0$.
Let $(x + h e_i)'$ be the projection of $x + h e_i$ on $p_1$. Let $p_2$ be the
$(k-1)$-dimensional hyperplane orthogonal to $x_0 - (x + h e_i)'$ which
contains $(x + h e_i)'$. The hyperplane $p_1$ divides $\mathbb{R}^k$ into two
parts $s_1, s_2$, where $s_2$ is open and contains $x$; the hyperplane $p_2$
divides $\mathbb{R}^k$ into two parts $s_3, s_4$, where $s_3$ is closed and
contains $x$. By observing

\[
(x + h e_i - (x + h e_i)') \cdot e_i = f_i(x) + \cos^2\theta_i \, h
\]

and that $(x + h e_i)_0$ must be in $s_1 \cap s_3$, we have

\[
f_i(x + h e_i) \ge (x + h e_i - (x + h e_i)') \cdot e_i = f_i(x) + \cos^2\theta_i \, h.
\]

This implies

\[
\frac{f_i(x + h e_i) - f_i(x)}{h} \ge \cos^2\theta_i. \tag{4.4}
\]

Therefore,

\[
\lim_{h \to 0+} \frac{f_i(x + h e_i) - f_i(x)}{h} \ge \cos^2\theta_i \quad \text{a.e.}
\]

So $\partial_i f_i(x) \ge \cos^2\theta_i$ a.e. The arguments for the other
possible values of $\theta_i$ are similar. This completes the proof of
Lemma 2.4.